{ "cells": [ { "cell_type": "markdown", "id": "69a9cc05", "metadata": {}, "source": [ "This is part 1 of a tutorial series. We recommend following them in order, starting with [Part 0: Welcome to `musica`](0.%20Welcome%20to%20MUSICA.ipynb)." ] }, { "cell_type": "markdown", "id": "3a9ef243", "metadata": {}, "source": [ "# Latin Hypercube Sampling in `musica`\n", "\n", "In previous tutorials, we created models for one- and two-grid-cell systems. This time around, we'll do 100!\n", "As you'll see, the code changes little in terms of creating the chemical system, solver, and state, and using them\n", "to advance the chemical systems.\n", "\n", "But, how to set conditions for so many grid cells? Instead of hand-writing them or using purely random values,\n", "we'll use a technique called Latin Hypercube Sampling (LHS), a statistical method for generating multidimensional random samples.\n", "LHS avoids the problem of clustering that can sometimes appear in pure random sampling, and is often used for sensitivity studies.\n", "\n", "This tutorial will go over a simple example of utilizing LHS to run a multi-grid-cell solver in MUSICA.\n", "\n", "For more information on Latin Hypercube Sampling, see [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.qmc.LatinHypercube.html)." ] }, { "cell_type": "markdown", "id": "d50c8f23", "metadata": {}, "source": [ "## 1. Importing Libraries\n", "Below is a list of the required libraries for this tutorial:" ] }, { "cell_type": "code", "execution_count": 9, "id": "5c595ccd", "metadata": {}, "outputs": [], "source": [ "import musica\n", "import musica.mechanism_configuration as mc\n", "import matplotlib.pyplot as plt\n", "from scipy.stats import qmc\n", "import pandas as pd\n", "import numpy as np\n", "import seaborn as sns\n", "pd.set_option('display.float_format', str) # This is done to make the arrays more readable\n", "np.set_printoptions(suppress=True) # This is done to make the arrays more readable" ] }, { "cell_type": "markdown", "id": "7f80a7cf", "metadata": {}, "source": [ "## 2. System, Solver, and State Creation\n", "\n", "We'll define the same chemical system we used in the [Multiple Grid Cells Tutorial](1.%20multiple_grid_cells.ipynb), and use that to create a solver and a 100-grid-cell state." ] }, { "cell_type": "code", "execution_count": 10, "id": "8724138e", "metadata": {}, "outputs": [], "source": [ "A = mc.Species(name=\"A\")\n", "B = mc.Species(name=\"B\")\n", "C = mc.Species(name=\"C\")\n", "species = [A, B, C]\n", "gas = mc.Phase(name=\"gas\", species=species)\n", "\n", "r1 = mc.Arrhenius(\n", " name=\"A_to_B\",\n", " A=4.0e-3,\n", " C=50,\n", " reactants=[A],\n", " products=[B],\n", " gas_phase=gas\n", ")\n", "\n", "r2 = mc.Arrhenius(\n", " name=\"B_to_C\",\n", " A=4.0e-3,\n", " C=50, \n", " reactants=[B],\n", " products=[C],\n", " gas_phase=gas\n", ")\n", "\n", "mechanism = mc.Mechanism(\n", " name=\"musica_micm_example\",\n", " species=species,\n", " phases=[gas],\n", " reactions=[r1, r2]\n", ")\n", "\n", "solver = musica.MICM(mechanism = mechanism, solver_type = musica.SolverType.rosenbrock_standard_order)\n", "\n", "num_grid_cells = 100\n", "state = solver.create_state(num_grid_cells)" ] }, { "cell_type": "markdown", "id": "99119268", "metadata": {}, "source": [ "## 3. Creating and Scaling a Latin Hypercube Sampler\n", "\n", "This Latin Hypercube Sampler (LHS) uses the same five dimensions as the [Multiple Grid Cells Tutorial](1.%20multiple_grid_cells.ipynb) to randomize each of the individual systems. Those five dimensions are:\n", "* temperature (Kelvin),\n", "* pressure (Pascals), and\n", "* the concentrations of each of the species (mol m-3).\n", "\n", "Next, the LHS is created with the provided number of dimensions with a randomized sample that will be scaled by the sampler.\n", "The upper and lower bounds for each of the five dimensions are then set, and the sample is scaled with those bounds by the LHS.\n", "Do note as before that the ordering inside the bounding arrays matters and cannot be changed." ] }, { "cell_type": "code", "execution_count": 11, "id": "773d1802", "metadata": {}, "outputs": [], "source": [ "ndim = 5\n", "nsamples = num_grid_cells\n", "\n", "# Create a Latin Hypercube sampler in the unit hypercube\n", "sampler = qmc.LatinHypercube(d=ndim)\n", "\n", "# Generate samples\n", "sample = sampler.random(n=nsamples)\n", "\n", "# Define bounds for each dimension (temperature, pressure, and concentrations of A, B, C)\n", "l_bounds = [275, 100753.3, 0, 0, 0] # Lower bounds\n", "u_bounds = [325, 101753.3, 10, 10, 10] # Upper bounds\n", "\n", "# Scale the samples to the defined bounds\n", "sample_scaled = qmc.scale(sample, l_bounds, u_bounds)" ] }, { "cell_type": "markdown", "id": "82364d1b", "metadata": {}, "source": [ "## 4. Setting Conditions and Running the Solver\n", "\n", "Instead of hand-writing the initial conditions, as in previous tutorials, we'll use the output of the LHS to set our `state` values. Then, we'll solve the system over 600 s." ] }, { "cell_type": "code", "execution_count": 12, "id": "4ae18af9", "metadata": {}, "outputs": [], "source": [ "temperatures = sample_scaled[:, 0]\n", "pressures = sample_scaled[:, 1]\n", "concentrations = {\n", " \"A\": [],\n", " \"B\": [],\n", " \"C\": []\n", "}\n", "concentrations[\"A\"] = sample_scaled[:, 2]\n", "concentrations[\"B\"] = sample_scaled[:, 3]\n", "concentrations[\"C\"] = sample_scaled[:, 4]\n", "\n", "state.set_conditions(temperatures, pressures)\n", "state.set_concentrations(concentrations)\n", "concentrations_solved = []\n", "time_step = 10 # s\n", "sim_length = 600 # s\n", "curr_time = 0 # s\n", "\n", "while curr_time <= sim_length:\n", " solver.solve(state, time_step)\n", " concentrations_solved.append(state.get_concentrations())\n", " curr_time += time_step" ] }, { "cell_type": "markdown", "id": "cd065d8a", "metadata": {}, "source": [ "## 5. Preparing the Results (Advanced; Optional Read)\n", "\n", "The intention of this code snippet is to split up each grid cell for each time step onto a separate row in the DataFrame so they can be averaged when plotted.\n", "\n", "This will produce a confidence interval for each time step since there will be a set of unique values at every time step for each species.\n", "\n", "As an example, if there are 2 grid cells and 5 time steps, the data table will have 10 rows, with the first being the first grid cell for the first time step, the second row being the second grid cell for the first time step, the third row being the first grid cell for the second time step, and so on.\n", "You can see this pattern in the displayed DataFrame as well (below).\n", "\n", "After the expansion of the DataFrame, adding the other columns is fairly straight-forward; they are simply given more rows since the number of rows is now the product of the number of grid cells and the number of time steps.\n", "The time column is notably different, however, since each time step has to be repeated for every grid cell.\n", "\n", "Due to the complexity of this code cell, it has been bundled into a function which is used below.\n", "Note that the first column is the index of the DataFrame, not the time step here." ] }, { "cell_type": "code", "execution_count": 13, "id": "92625aa8", "metadata": {}, "outputs": [], "source": [ "def convert_results_all_cells():\n", " concentrations_solved_pd = []\n", " time = []\n", " for i in range(0, sim_length + 1, time_step):\n", " for j in range(0, num_grid_cells):\n", " concentrations_solved_pd.append({species: concentration[j] for species, concentration in concentrations_solved[int(i/time_step)].items()})\n", " time.append(i)\n", " df = pd.DataFrame(concentrations_solved_pd)\n", " df = df.rename(columns = {'A' : 'CONC.A.mol m-3', 'B' : 'CONC.B.mol m-3', 'C' : 'CONC.C.mol m-3'})\n", " df['time.s'] = time\n", " df['ENV.temperature.K'] = np.repeat(temperatures[0], (sim_length/time_step + 1.0) * num_grid_cells)\n", " df['ENV.pressure.Pa'] = np.repeat(pressures[0], (sim_length/time_step + 1.0) * num_grid_cells)\n", " df['ENV.air number density.mol m-3'] = np.repeat(state.get_conditions()['air_density'][0], (sim_length/time_step + 1.0) * num_grid_cells)\n", " df = df[['time.s', 'ENV.temperature.K', 'ENV.pressure.Pa', 'ENV.air number density.mol m-3', 'CONC.A.mol m-3', 'CONC.B.mol m-3', 'CONC.C.mol m-3']]\n", " return concentrations_solved_pd, df" ] }, { "cell_type": "markdown", "id": "74d6d56f", "metadata": {}, "source": [ "Now, let's use the function we've written to create the DataFrame" ] }, { "cell_type": "code", "execution_count": 14, "id": "a338e7f6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| \n", " | time.s | \n", "ENV.temperature.K | \n", "ENV.pressure.Pa | \n", "ENV.air number density.mol m-3 | \n", "CONC.A.mol m-3 | \n", "CONC.B.mol m-3 | \n", "CONC.C.mol m-3 | \n", "
|---|---|---|---|---|---|---|---|
| 0 | \n", "0 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "1.595583909208889 | \n", "1.585270886031598 | \n", "1.7065382493105852 | \n", "
| 1 | \n", "0 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "8.56059970344138 | \n", "9.126348120137955 | \n", "4.4624016818930174 | \n", "
| 2 | \n", "0 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "5.126079936709042 | \n", "6.8445384329146775 | \n", "6.763337737343939 | \n", "
| 3 | \n", "0 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "6.338825026981972 | \n", "2.2582074122408824 | \n", "6.241628537389399 | \n", "
| 4 | \n", "0 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "2.4635566912685825 | \n", "9.055516556581432 | \n", "2.959459728220327 | \n", "
| ... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
| 6095 | \n", "600 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "0.12164507170374605 | \n", "0.547669866815287 | \n", "13.327782221911818 | \n", "
| 6096 | \n", "600 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "0.35817252241426667 | \n", "1.200242137056683 | \n", "17.767939194001215 | \n", "
| 6097 | \n", "600 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "0.32833988053461194 | \n", "1.105485210751788 | \n", "15.088075933807765 | \n", "
| 6098 | \n", "600 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "0.25341840006310173 | \n", "0.9485916367880122 | \n", "14.100160068090627 | \n", "
| 6099 | \n", "600 | \n", "317.0167635067038 | \n", "101722.6869975267 | \n", "38.59236650811341 | \n", "0.00855493275976946 | \n", "0.22565112923219416 | \n", "5.396154380948566 | \n", "
6100 rows × 7 columns
\n", "